Comparison of Statistical Data Models for Identifying Differentially Expressed Genes Using a Generalized Likelihood Ratio Test

Authors

  • Kok-Yong Seng
  • Robb W. Glenny
  • David K. Madtes
  • Mary E. Spilker
  • Paolo Vicini
  • Sina A. Gharib
Abstract

Currently, statistical techniques for analysis of microarray-generated data sets have deficiencies due to limited understanding of errors inherent in the data. A generalized likelihood ratio (GLR) test based on an error model has recently been proposed to identify differentially expressed genes from microarray experiments. However, the use of different error structures under the GLR test has not been evaluated, nor has this method been compared to commonly used statistical tests such as the parametric t-test. The concomitant effects of varying data signal-to-noise ratio and replication number on the performance of statistical tests also remain largely unexplored. In this study, we compared the effects of different underlying statistical error structures on the GLR test's power in identifying differentially expressed genes in microarray data. We evaluated these variants of the GLR test, as well as the one-sample t-test, on simulated data by means of receiver operating characteristic (ROC) curves. Further, we used bootstrapping of ROC curves to assess the statistical significance of differences between the areas under the curves. Our results showed that i) the GLR tests outperformed the t-test for detecting differential gene expression, ii) the identity of the underlying error structure was important in determining the GLR tests' performance, and iii) signal-to-noise ratio was a more important contributor than sample replication in identifying statistically significant differential gene expression.
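
The evaluation strategy described in the abstract (score simulated genes with competing statistics, compare them by area under the ROC curve, and bootstrap the difference in areas) can be illustrated with a minimal sketch. The abstract does not specify the error models or the GLR statistic, so this sketch does not implement the GLR test: it uses the absolute mean log-ratio as a hypothetical stand-in comparator against the one-sample t-statistic. The simulation settings (2000 genes, 4 replicates, 10% differentially expressed, unit noise, unit effect size) and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    Assumes scores are continuous, so ties only occur between bootstrap
    copies of the same gene (same label), which does not change the result.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def bootstrap_auc_diff(s1, s2, labels, n_boot=500, seed=1):
    """Percentile 95% interval for AUC(s1) - AUC(s2), resampling genes."""
    rng = np.random.default_rng(seed)
    n = len(labels)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample genes with replacement
        diffs[b] = auc(s1[idx], labels[idx]) - auc(s2[idx], labels[idx])
    return np.percentile(diffs, [2.5, 97.5])

# Simulated log-ratios (assumed settings, not taken from the paper).
rng = np.random.default_rng(0)
n_genes, n_rep = 2000, 4
labels = rng.random(n_genes) < 0.10            # truly differentially expressed
effect = np.where(labels, 1.0, 0.0)            # true shift in log-ratio
data = effect[:, None] + rng.normal(0.0, 1.0, size=(n_genes, n_rep))

mean = data.mean(axis=1)
sd = data.std(axis=1, ddof=1)
t_stat = np.abs(mean / (sd / np.sqrt(n_rep)))  # one-sample t-statistic
mean_stat = np.abs(mean)                       # stand-in comparator, NOT the GLR

auc_t = auc(t_stat, labels)
auc_m = auc(mean_stat, labels)
ci = bootstrap_auc_diff(mean_stat, t_stat, labels)
print(f"AUC t-test: {auc_t:.3f}, AUC stand-in: {auc_m:.3f}")
print(f"95% bootstrap interval for the AUC difference: {ci}")
```

If the bootstrap interval for the difference in areas excludes zero, the two statistics would be judged to differ in discriminating power under these simulated conditions. Varying `n_rep` and the noise standard deviation gives a rough way to explore the replication versus signal-to-noise trade-off the abstract mentions.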

Similar resources

Testing for Differentially-Expressed Genes by Maximum-Likelihood Analysis of Microarray Data

Although two-color fluorescent DNA microarrays are now standard equipment in many molecular biology laboratories, methods for identifying differentially expressed genes in microarray data are still evolving. Here, we report a refined test for differentially expressed genes which does not rely on gene expression ratios but directly compares a series of repeated measurements of the two dye intens...

A generalized likelihood ratio test to identify differentially expressed genes from microarray data

MOTIVATION Microarray technology emerges as a powerful tool in life science. One major application of microarray technology is to identify differentially expressed genes under various conditions. Currently, the statistical methods to analyze microarray data are generally unsatisfactory, mainly due to the lack of understanding of the distribution and error structure of microarray data. RESULTS...

Robust Modeling of Differential Gene Expression Data Using Normal/Independent Distributions: A Bayesian Approach

In this paper, the problem of identifying differentially expressed genes under different conditions using gene expression microarray data, in the presence of outliers, is discussed. For this purpose, the robust modeling of gene expression data using some powerful distributions known as normal/independent distributions is considered. These distributions include the Student's t and normal distrib...

Statistical methods for identifying differentially expressed genes in microarray data

Microarray is a recently developed functional genomic technology that has powerful applications in a wide array of biological research areas, including the medical sciences, agriculture, biotechnology and environmental studies. One of the important problems in the analysis of microarray data is the identification of differentially expressed genes. Commonly used approaches for identifying differ...

rSeqDiff: Detecting Differential Isoform Expression from RNA-Seq Data Using Hierarchical Likelihood Ratio Test

High-throughput sequencing of transcriptomes (RNA-Seq) has recently become a powerful tool for the study of gene expression. We present rSeqDiff, an efficient algorithm for the detection of differential expression and differential splicing of genes from RNA-Seq experiments across multiple conditions. Unlike existing approaches which detect differential expression of transcripts, our approach co...

Journal title:

Volume 2  Issue 

Pages  -

Publication date 2008